29 research outputs found

    Dilithium for Memory Constrained Devices

    Get PDF
    We investigate the use of the Dilithium post-quantum digital signature scheme on memory-constrained systems. Reference and optimized implementations of Dilithium in the benchmarking framework pqm4 (Cortex-M4) require 50 – 100 KiB of memory, demonstrating the significant challenge to use Dilithium on small IoT platforms. We show that compressing polynomials, using an alternative number theoretic transform, and falling back to the schoolbook method for certain multiplications reduces the memory footprint significantly. This results in the first implementation of Dilithium for which the recommended parameter set requires less than 7 KiB of memory for key and signature generation and less than 3 KiB of memory for signature verification. We also provide benchmark details of a portable implementation in order to estimate the performance impact when using these memory reduction methods

    PQ.V.ALU.E: Post-Quantum RISC-V Custom ALU Extensions on Dilithium and Kyber

    Get PDF
    This paper explores the challenges and potential solutions of implementing the recommended upcoming post-quantum cryptography standards (the CRYSTALS-Dilithium and CRYSTALS-Kyber algorithms) on resource constrained devices. The high computational cost of polynomial operations, fundamental to cryptography based on ideal lattices, presents significant challenges in an efficient implementation. This paper proposes a hardware/software co-design strategy using RISC-V extensions to optimize resource utilization and speed up the number-theoretic transformations (NTTs). The primary contributions include a lightweight custom arithmetic logic unit (ALU), integrated into a 4-stage pipeline 32-bit RISC-V processor. This ALU is tailored towards the NTT computations and supports modular arithmetic as well as NTT butterfly operations. Furthermore, an extension to the RISC-V instruction set is introduced, with ten new instructions accessing the custom ALU to perform the necessary operations. The new instructions reduce the cycle count of the Kyber and Dilithium NTTs by more than 80% compared to optimized assembly, while being more lightweight than other works that exist in the literature

    Exploiting Small-Norm Polynomial Multiplication with Physical Attacks: Application to CRYSTALS-Dilithium

    Get PDF
    We present a set of physical attacks against CRYSTALS-Dilithium that accumulate noisy knowledge on secret keys over multiple signatures, finally leading to a full recovery attack. The methodology is composed of two steps. The first step consists of observing or inserting a bias in the posterior distribution of sensitive variables. The second step of an information processing phase which is based on belief propagation, which allows effectively exploiting that bias. The proposed concrete attacks rely on side-channel information, injection of fault attacks, or a combination of the two. Interestingly, the adversary benefits from the knowledge on the released signature, but is not dependent on it. We show that the combination of a physical attack with the binary knowledge of acceptance or rejection of a signature also leads to exploitable information on the secret key. Finally, we demonstrate that this approach is also effective against shuffled implementations of CRYSTALS-Dilithium

    Efficient compression of SIDH public keys

    Get PDF
    Supersingular isogeny Diffie-Hellman (SIDH) is an attractive candidate for post-quantum key exchange, in large part due to its relatively small public key sizes. A recent paper by Azarderakhsh, Jao, Kalach, Koziel and Leonardi showed that the public keys defined in Jao and De Feo\u27s original SIDH scheme can be further compressed by around a factor of two, but reported that the performance penalty in utilizing this compression blew the overall SIDH runtime out by more than an order of magnitude. Given that the runtime of SIDH key exchange is currently its main drawback in relation to its lattice- and code-based post-quantum alternatives, an order of magnitude performance penalty for a factor of two improvement in bandwidth presents a trade-off that is unlikely to favor public-key compression in many scenarios. In this paper, we propose a range of new algorithms and techniques that accelerate SIDH public-key compression by more than an order of magnitude, making it roughly as fast as a round of standalone SIDH key exchange, while further reducing the size of the compressed public keys by approximately 12.5%. These improvements enable the practical use of compression, achieving public keys of only 330 bytes for the concrete parameters used to target 128 bits of quantum security and further strengthens SIDH as a promising post-quantum primitive

    Enabling FrodoKEM on Embedded Devices

    Get PDF
    FrodoKEM is a lattice-based Key Encapsulation Mechanism (KEM) based on unstructured lattices. From a security point of view this makes it a conservative option to achieve post-quantum security, hence why it is favored by several European authorities (e.g., German BSI and French ANSSI). Relying on unstructured instead of structured lattices (e.g., CRYSTALS-Kyber) comes at the cost of additional memory usage, which is particularly critical for embedded security applications such as smart cards. For example, prior FrodoKEM-640 implementations (using AES) on Cortex-M4 require more than 80 kB of stack making it impossible to run on some embedded systems. In this work, we explore several stack reduction strategies and the resulting time versus memory trade-offs. Concretely, we reduce the stack consumption of FrodoKEM by a factor 2–3× compared to the smallest known implementations with almost no impact on performance. We also present various time-memory trade-offs going as low as 8 kB for all AES parameter sets, and below 4 kB for FrodoKEM-640. By introducing a minor tweak to the FrodoKEM specifications, we additionally reduce the stack consumption down to 8 kB for all the SHAKE versions. As a result, this work enables FrodoKEM on more resource constrained embedded systems

    From MLWE to RLWE: A Differential Fault Attack on Randomized & Deterministic Dilithium

    Get PDF
    The post-quantum digital signature scheme CRYSTALS-Dilithium has been recently selected by the NIST for standardization. Implementing CRYSTALS-Dilithium, and other post-quantum cryptography schemes, on embedded devices raises a new set of challenges, including ones related to performance in terms of speed and memory requirements, but also related to side-channel and fault injection attacks security. In this work, we investigated the latter and describe a differential fault attack on the randomized and deterministic versions of CRYSTALS-Dilithium. Notably, the attack requires a few instructions skips and is able to reduce the MLWE problem that Dilithium is based on to a smaller RLWE problem which can be practically solved with lattice reduction techniques. Accordingly, we demonstrated key recoveries using hints extracted on the secret keys from the same faulted signatures using the LWE with side-information framework introduced by Dachman-Soled et al. at CRYPTO’20. As a final contribution, we proposed algorithmic countermeasures against this attack and in particular showed that the second one can be parameterized to only induce a negligible overhead over the signature generation

    Protecting Dilithium against Leakage: Revisited Sensitivity Analysis and Improved Implementations

    Get PDF
    CRYSTALS-Dilithium has been selected by the NIST as the new stan- dard for post-quantum digital signatures. In this work, we revisit the side-channel countermeasures of Dilithium in three directions. First, we improve its sensitivity analysis by classifying intermediate computations according to their physical security requirements. Second, we provide improved gadgets dedicated to Dilithium, taking advantage of recent advances in masking conversion algorithms. Third, we combine these contributions and report performance for side-channel protected Dilithium implementations. Our benchmarking results additionally put forward that the ran- domized version of Dilithium can lead to significantly more efficient implementations (than its deterministic version) when side-channel attacks are a concern

    Set It and Forget It! Turnkey ECC for Instant Integration

    Get PDF
    Historically, Elliptic Curve Cryptography (ECC) is an active field of applied cryptography where recent focus is on high speed, constant time, and formally verified implementations. While there are a handful of outliers where all these concepts join and land in real-world deployments, these are generally on a case-by-case basis: e.g.\ a library may feature such X25519 or P-256 code, but not for all curves. In this work, we propose and implement a methodology that fully automates the implementation, testing, and integration of ECC stacks with the above properties. We demonstrate the flexibility and applicability of our methodology by seamlessly integrating into three real-world projects: OpenSSL, Mozilla's NSS, and the GOST OpenSSL Engine, achieving roughly 9.5x, 4.5x, 13.3x, and 3.7x speedup on any given curve for key generation, key agreement, signing, and verifying, respectively. Furthermore, we showcase the efficacy of our testing methodology by uncovering flaws and vulnerabilities in OpenSSL, and a specification-level vulnerability in a Russian standard. Our work bridges the gap between significant applied cryptography research results and deployed software, fully automating the process

    On Kummer Lines with Full Rational 2-torsion and Their Usage in Cryptography

    Get PDF
    Contains fulltext : 219228.pdf (publisher's version ) (Closed access) Contains fulltext : 219228.pdf (preprint version ) (Closed access

    Rapidly Verifiable XMSS Signatures

    No full text
    This work presents new speed records for XMSS (RFC 8391) signature verification on embedded devices. For this we make use of a probabilistic method recently proposed by Perin, Zambonin, Martins, Custódio, and Martina (PZMCM) at ISCC 2018, that changes the XMSS signing algorithm to search for rapidly verifiable signatures. We improve the method, ensuring that the added signing cost for the search is independent of the message length. We provide a statistical analysis of the resulting verification speed and support it by experiments. We present a record setting RFC compatible implementation of XMSS verification on the ARM Cortex-M4. At a signing time of about one minute on a general purpose CPU, we create signatures that are verified about 1.44 times faster than traditionally generated signatures. Adding further well-known implementation optimizations to the verification algorithm we reduce verification time by over a factor two from 13.85 million to 6.56 million cycles. In contrast to previous works, we provide a detailed security analysis of the resulting signature scheme under classical and quantum attacks that justifies our selection of parameters. On the way, we fill a gap in the security analysis of XMSS as described in RFC 8391 proving that the modified message hashing in the RFC does indeed mitigate multi-target attacks. This was not shown before and might be of independent interest
    corecore